Resolving Person Names in Web People Search
Abstract
Disambiguating person names in a set of documents (such as the web pages returned in response to a person-name query) is a key task for presenting search results and for automatically profiling experts. Because the documents are largely unstructured and the number of people sharing a name is unknown, the problem is difficult. This chapter treats person name disambiguation as a document clustering problem, in which each document is assumed to represent a particular person. This leads to the person cluster hypothesis, which states that similar documents tend to represent the same person. Single Pass Clustering, k-Means Clustering, Agglomerative Clustering and Probabilistic Latent Semantic Analysis are employed and empirically evaluated in this context. On the SemEval 2007 Web People Search task, it is shown that the person cluster hypothesis holds reasonably well and that the Single Pass Clustering and Agglomerative Clustering methods provide the best performance.
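To make the person cluster hypothesis concrete, the following is a minimal sketch of single pass clustering over web pages. It is not the chapter's implementation: the bag-of-words representation, the cosine similarity measure, the summed-vector centroid, the `threshold` value, and the function names are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for one document."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass_cluster(documents, threshold=0.3):
    """Process documents once, in order.

    Each document joins the most similar existing cluster if that
    similarity reaches `threshold` (an assumed value); otherwise it
    starts a new cluster. Returns a list of clusters, each a list of
    document indices.
    """
    centroids = []  # summed term vectors, one per cluster
    clusters = []   # document indices, one list per cluster
    for index, text in enumerate(documents):
        vector = bag_of_words(text)
        best, best_sim = None, threshold
        for cluster_id, centroid in enumerate(centroids):
            sim = cosine(vector, centroid)
            if sim >= best_sim:
                best, best_sim = cluster_id, sim
        if best is None:
            centroids.append(Counter(vector))
            clusters.append([index])
        else:
            centroids[best].update(vector)  # add term counts to the centroid
            clusters[best].append(index)
    return clusters
```

Under the hypothesis, each resulting cluster is read as one person. The number of clusters is not fixed in advance, which suits the setting where the number of people sharing a name is unknown; the result does depend on the document order and on the chosen threshold.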
Similar resources
Vitae and Map Display System for People on the Web
We present a system that displays a curriculum vitae with a map to understand people. Our method is based on the following processes: (1) creating curriculum vitae using related work [1], (2) extracting the names of places where the person studied and worked from the vitae, (3) getting such location information as latitudes, longitudes, and addresses from the place names using Google Maps API, ...
Summarizing and Visualizing Web People Search Results
People search is one major search activity on the Web. If the list of people search results is merely “person 1, person 2, . . . and so on,” users have difficulty determining which person clusters they should select. In this paper, we present a project that summarizes and visualizes Web people search results to help users select person clusters more easily. We explore three ways of summarizing ...
Which Who are They? People Attribute Extraction and Disambiguation in Web Search Results
People name search often returns a lot of Web pages containing the strings of personal names. Due to namesake, extracting target person attributes (such as birthday, occupation, affiliation, nationality, contact information, etc.) is expected to be helpful to differentiate documents related to different people and thus group documents related to the same person. This paper presents the methodol...
Assigning Location Information to Display Individuals on a Map for Web People Search Results
Distinguishing people with identical names is becoming more and more important in Web search. This research aims to display person icons on a map to help users select person clusters that are separated into different people from the result of person searches on the Web. We propose a method to assign person clusters with one piece of location information. Our method is comprised of two processes...
UC3M_13: Disambiguation of Person Names Based on the Composition of Simple Bags of Typed Terms
This paper describes a system designed to disambiguate person names in a set of Web pages. In our approach Web documents are represented as different sets of features or terms of different types (bag of words, URLs, names and numbers). We apply Agglomerative Vector Space clustering that uses the similarity between pairs of analogous feature sets. This system achieved a value of 66% for Fα=0.2 a...